Approximate Range-Sum Queries over Data Cubes Using Cosine Transform
Abstract
In this research, we propose using the discrete cosine transform (DCT) to approximate the cumulative distributions of data cube cells’ values. The cosine transform is known to have a good energy compaction property and can therefore approximate data distribution functions with a small number of coefficients. The derived estimator is accurate and easy to update. We perform experiments to compare its performance with a well-known technique, the (Haar) wavelet. The experimental results show that the cosine transform performs much better than the wavelet in estimation accuracy, speed, space efficiency, and ease of update.
Keywords: DCT, Data Cube
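The idea in the abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' estimator: it assumes a 1-D cube, an orthonormal DCT-II computed directly in O(N^2), and a synopsis that keeps only the first `keep` low-frequency coefficients of the cumulative (prefix-sum) array. The function names `dct2`, `evaluate`, `build_synopsis`, and `range_sum` are hypothetical, chosen for this illustration only.

```python
import math


def dct2(values):
    """Orthonormal DCT-II of a sequence, evaluated directly in O(N^2)."""
    n = len(values)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(v * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, v in enumerate(values))
        coeffs.append(scale * total)
    return coeffs


def evaluate(coeffs, i, n):
    """Inverse transform at position i, using only the retained coefficients."""
    total = 0.0
    for k, c in enumerate(coeffs):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
    return total


def build_synopsis(cells, keep):
    """Transform the cumulative sums and keep the first `keep` coefficients."""
    cumulative = []
    running = 0.0
    for v in cells:
        running += v
        cumulative.append(running)  # cumulative[i] = cells[0] + ... + cells[i]
    return dct2(cumulative)[:keep], len(cells)


def range_sum(synopsis, lo, hi):
    """Estimate sum(cells[lo..hi]), inclusive, as a difference of two
    approximated cumulative values."""
    coeffs, n = synopsis
    upper = evaluate(coeffs, hi, n)
    lower = evaluate(coeffs, lo - 1, n) if lo > 0 else 0.0
    return upper - lower


cells = [3, 1, 4, 1, 5, 9, 2, 6]
full = build_synopsis(cells, keep=len(cells))  # all coefficients: exact up to rounding
small = build_synopsis(cells, keep=3)          # truncated: an approximation only
exact_estimate = range_sum(full, 2, 5)
rough_estimate = range_sum(small, 2, 5)
```

With every coefficient retained, the estimate matches the true range sum up to floating-point rounding; with fewer coefficients, it becomes an approximation whose quality depends on how smooth the cumulative distribution is. Because the transform is linear, a change to one cell can in principle be folded into the stored coefficients without rebuilding them from the raw data.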
Similar resources
Relative Prefix Sums: An Efficient Approach for Querying Dynamic OLAP Data Cubes
Range sum queries on data cubes are a powerful tool for analysis. A range sum query applies an aggregation operation (e.g., SUM) over all selected cells in a data cube, where the selection is specified by providing ranges of values for numeric dimensions. Many application domains require that information provided by analysis tools be current or "near-current." Existing techniques for range sum ...
Selectivity Estimation of Range Queries Based on Data Density Approximation via Orthonormal Series
Selectivity estimation is an integral part of query optimization. In this paper, we propose to approximate data density functions of relations and use the approximations to estimate selectivities of range queries. A data density function here is approximated by a partial sum of an orthonormal series. Compared with histogram-based approaches, such as the wavelet, DCT, and kernel-spline methods, ...
On the Optimality of the Greedy Heuristic in Wavelet Synopses for Range Queries
In recent years wavelet based synopses were shown to be effective for approximate queries in database systems. The simplest wavelet synopses are constructed by computing the Haar transform over a vector consisting of either the raw-data or the prefix-sums of the data, and using a greedy-heuristic to select the wavelet coefficients that are kept in the synopsis. The greedy-heuristic is known to ...
Range queries in dynamic OLAP data cubes
A range query applies an aggregation operation (e.g., SUM) over all selected cells of an OLAP data cube where the selection is specified by providing ranges of values for numeric dimensions. Range sum queries on data cubes are a powerful analysis tool. Many application domains require that data cubes are updated often and the information provided by analysis tools is current or "near current"...
Wavelet-based relative prefix sum methods for range sum queries in data cubes
Data mining and related applications often rely on extensive range sum queries and thus, it is important for these queries to scale well. Range sum queries in data cubes can be achieved in time O(1) using prefix sum aggregates but prefix sum update costs are proportional to the size of the data cube, O(n^d). Using the Relative Prefix Sum (RPS) method, the update costs can be reduced to the ro...
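For context on the prefix-sum approach mentioned in the listings above, here is a minimal two-dimensional sketch of a plain prefix-sum table, assuming a small dense cube stored as a list of lists. It shows only the basic technique, not the Relative Prefix Sum method; `build_prefix` and `prefix_range_sum` are hypothetical names used for illustration.

```python
def build_prefix(cube):
    """Build a (rows+1) x (cols+1) table where p[r][c] holds the sum of
    cube[0..r-1][0..c-1]; the extra zero row and column avoid edge cases."""
    rows, cols = len(cube), len(cube[0])
    p = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            p[r + 1][c + 1] = cube[r][c] + p[r][c + 1] + p[r + 1][c] - p[r][c]
    return p


def prefix_range_sum(p, r1, c1, r2, c2):
    """Sum of cube[r1..r2][c1..c2], inclusive, from four table lookups."""
    return p[r2 + 1][c2 + 1] - p[r1][c2 + 1] - p[r2 + 1][c1] + p[r1][c1]


cube = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
table = build_prefix(cube)
corner = prefix_range_sum(table, 1, 1, 2, 2)  # cells 5, 6, 8, 9
total = prefix_range_sum(table, 0, 0, 2, 2)   # the whole cube
```

Each query reads a fixed number of table entries regardless of the range size, while a change to `cube[0][0]` would affect every entry of the table, which is the update cost the listings refer to.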